The recent advent of "-omics" technologies has heralded a
new era of personalized medicine. Personalized medicine refers
to the ability to segment heterogeneous patients into subsets
within which the response to a therapeutic intervention is
homogeneous. This new paradigm in healthcare is beginning to
affect both research and clinical practice. The key to success
in personalized medicine is to uncover molecular biomarkers
that drive individual variability in clinical outcomes or drug
responses. In this review, we begin with an overview of
personalized medicine in breast cancer and illustrate the
statistical approaches most commonly encountered in the recent
literature for uncovering gene signatures.
INTRODUCTION

Not all patients respond equally to cancer therapeutic
compounds. Recent advances in high-throughput genomic,
transcriptomic, and proteomic technologies, together with the
ever-increasing understanding of the molecular mechanisms of
cancer, permit the discovery of genes that underlie individual
variation in clinical outcomes or drug responses. Personalized
medicine has revolutionized the healthcare paradigm by
integrating personal genetic information, improving the
efficacy of drug treatment, shifting the practice of medicine,
and creating opportunities to introduce new business and
healthcare economic models.

The traditional "one-dose-fits-all" approach to drug
development and clinical therapy has been ineffective, as it
exposes all patients to the risks of drug toxicity and treatment
failure [1]. The percentage of patients for whom a major drug
is effective is presented in Figure 1 [1]. Variability across
diseases is considerable: 38% to 75% of patients fail to respond
to a treatment. Cancer drugs have the lowest average response
rate, at 25%.
Adverse drug reactions as a consequence of treatment are
an even greater problem. Among drugs approved in the U.S., 16%
have shown adverse drug reactions [1]. A frequently cited
meta-analysis estimated that 6.7% of all hospitalized patients
in the U.S. experience adverse drug reactions and that the
number of associated deaths exceeds 100,000 annually [2]. A
study conducted in a major hospital identified 2,227 cases of
adverse drug effects among hospitalized patients and reported
that 50% of these cases are likely to be related to genetic
factors [3].

Personalized medicine is the ability to segment heteroge- 
neous subsets of patients whose response to a therapeutic 
intervention within each subset is homogeneous [4]. Under
this new healthcare paradigm, physicians can make optimal 
choices to maximize the likelihood of effective treatment and 
simultaneously avoid the risks of adverse drug reactions; 
scientists can improve the drug discovery process, and pharma- 
ceutical companies can manufacture medical devices to fore- 
cast patient prognosis, facilitating early disease detection. 

The ultimate goal of personalized medicine is to furnish the 
proper treatment to the right person at the right time [5]. The 
potential impact of personalized medicine is contingent upon
the systematic discovery of novel biomarkers, from genome-wide
candidates, that account for variation across individuals. This
review begins with an overview of personalized medicine and
illustrates the statistical approaches for uncovering
biomarkers most commonly encountered in the recent literature.
Figure 1. Inefficacy of the one-dose-fits-all approach. This figure
depicts the percentage of patients for whom a major drug is effective on
average. With the high variability across diseases, 38% to 75% of
patients fail to respond to a treatment. The average response rate of a
cancer drug is the lowest at 25%, suggesting that 75% of patients with
cancer are over-dosed and will potentially suffer from an adverse drug
reaction. From Spear BB, et al. Trends Mol Med 2001;7:201-4 [1].



DEFINITION OF PERSONALIZED MEDICINE: 
INDIVIDUALIZED TREATMENT VS. TREATMENT 
FOR A SUB-PATIENT GROUP 

Personalized medicine has been defined in many ways. 
According to the U.S. National Institutes of Health (NIH), 
personalized medicine is "an emerging practice of medicine 
that uses an individual's genetic profile to guide decisions 
made in regard to the prevention, diagnosis, and treatment of 
disease" [6]. The U.S. Food and Drug Administration defined 
personalized medicine as "the best medical outcomes by
choosing treatments that work well with a person's genomic
profile or with certain characteristics in the person's blood
proteins or cell surface proteins" [7]. The President's Council
of Advisors on Science and Technology (PCAST) described
personalized medicine as "tailoring of medical treatment to
the individual characteristics of each patient" [4].

It is important to recognize that personalized medicine
does not literally mean individualized treatment. The idea of
personalized medicine has often been exaggerated, as in the
Newsweek headline (June 10, 2005) "Medicine Tailored Just for
You." In fact, a new treatment regimen is assessed on a group
of carefully selected patients, not on individuals [5]. As such,
PCAST reports that personalized medicine is "the ability to
classify individuals into subpopulations that differ in their
susceptibility to a particular disease or their response to a
specific treatment" [4]. If a new treatment works effectively on
a sub-patient group, a preventive intervention can then be
furnished to those who will benefit, avoiding adverse drug
effects and sparing expense for those who will not.

BIOMARKERS: PROGNOSTIC VS. PREDICTIVE 

A biomarker is a reliable and accurate measurement that in- 
dicates a normal biological process, a pathogenic process, or a 
pharmacological response to a therapeutic intervention [8]. 
With this broad and general definition, biomarkers include 
physiological measurements such as lung function, blood 
pressure or electroencephalography, molecular (DNA, protein, 
metabolite) or cellular measures from biofluids (blood, plasma, 
serum, and urine), molecular, cellular or histopathological 
measures from solid tissue samples, and measurements from 
magnetic resonance imaging or computed tomography images 
[9]. 

In this review, we concentrate on "prognostic" and
"predictive" biomarkers that forecast patient outcomes. A
prognostic biomarker is associated with a patient's clinical
outcome and can be used to select patients for an adjuvant
systemic treatment irrespective of the patient's response to
treatment, whereas a predictive biomarker is related to the
patient's response to a particular intervention.

According to a U.S. NIH Consensus Conference, "a clinical
useful prognostic biomarker must be a proven independent,
significant factor that is easy to determine and interpret and
that has therapeutic consequences" [10]. A prognostic
biomarker provides information about the patient's overall
cancer outcome irrespective of the therapeutic response [11].
Therefore, a prognostic biomarker can be exploited to select
patients for an adjuvant systemic treatment but does not
forecast the treatment response [6].

Decision making about adjuvant systemic treatment for
breast cancer is usually based on nodal status [12-14], tumor
size [15,16], tumor type/grade [17-20], lymphatic and vascular
invasion [21,22], tumor hormone receptor and human epidermal
growth factor receptor 2 (HER2)/neu status [23-26], age
[27,28], and ethnicity [29-31]. Prognostic biomarkers that
provide better information on relapse risk could spare many
patients chemotherapy toxicity without compromising
survival [32]. The prognostic significance of a biomarker needs
to be demonstrated in prospective randomized clinical trials.

In contrast, a predictive biomarker provides information
about the effect of a therapeutic intervention [32]. In other
words, a predictive biomarker enables identification of the
subset of patients who are responsive to a specific therapy,
where response is defined by any of the clinical endpoints
commonly measured in clinical trials [33]. Because a predictive
biomarker implies that treatment benefit differs between
sub-patient groups classified by the status of the biomarker, a
significant interaction between treatment effect and patient
category needs to be statistically validated, ideally in a
randomized clinical trial [34].
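The interaction requirement for a predictive biomarker can be made concrete with a small sketch. The example below assumes a two-arm trial with a binary response and a binary biomarker, and compares the treatment log odds ratio between the biomarker-positive and biomarker-negative strata with a Wald test. The counts and function names are hypothetical and chosen only for illustration; they do not come from the studies cited here, and a real analysis would typically use a regression model with an explicit interaction term and adjust for covariates.

```python
import math

def log_odds_ratio(resp_trt, n_trt, resp_ctl, n_ctl):
    """Log odds ratio of response (treated vs. control) and its Wald variance."""
    a, b = resp_trt, n_trt - resp_trt   # treated: responders, non-responders
    c, d = resp_ctl, n_ctl - resp_ctl   # control: responders, non-responders
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero in sparse tables
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def interaction_test(pos, neg):
    """Wald test that the treatment effect differs between biomarker strata.

    pos and neg are tuples (resp_trt, n_trt, resp_ctl, n_ctl).
    """
    lor_pos, var_pos = log_odds_ratio(*pos)
    lor_neg, var_neg = log_odds_ratio(*neg)
    z = (lor_pos - lor_neg) / math.sqrt(var_pos + var_neg)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return lor_pos, lor_neg, z, p

# Hypothetical counts, for illustration only:
# (responders on treatment, n treated, responders on control, n control)
biomarker_positive = (60, 100, 30, 100)
biomarker_negative = (32, 100, 30, 100)

lor_pos, lor_neg, z, p = interaction_test(biomarker_positive, biomarker_negative)
print(f"log OR (marker+) = {lor_pos:.2f}, log OR (marker-) = {lor_neg:.2f}")
print(f"interaction z = {z:.2f}, two-sided p = {p:.4f}")
```

A large treatment effect in one stratum alone is not evidence of a predictive biomarker; the quantity of interest is the difference between the stratum-specific effects, which is what the z statistic measures.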

Predictive biomarkers can help physicians to forecast the 
effects of a particular treatment. Numerous proteins and genes 
exist that are specifically associated with breast cancer growth, 
proliferation, and metastasis. A deeper understanding of
their roles in the response to various therapies may
empower physicians to determine optimal treatments for
patients with breast cancer [35].

Some biomarkers are both prognostic and predictive (Table 
1) [36,37]. For example, patients with estrogen receptor (ER) 
and/or progesterone receptor (PR)-positive tumors have
longer survival than those with hormone receptor-negative 
tumors [15,38]. Additionally, a recent randomized trial reported 
that high cellular ER and PR expression predicts the benefit 
from adjuvant tamoxifen [39]. 

As another example, HER2/neu gene amplification, which
leads to overexpression of its receptor on the cell membrane
in approximately 30% of human breast tumors, is associated with
a worse prognosis in patients with node-positive breast cancer
due to increased proliferation and angiogenesis and inhibition
of apoptosis [23-26]. HER2/neu is also the target of the
monoclonal antibody trastuzumab, from which patients with
HER2/neu-overexpressing tumors benefit in the metastatic and
adjuvant settings [40-42].

WHY PERSONALIZED MEDICINE? 

The wide-ranging impacts and myriad opportunities pro- 
vided by personalized medicine can be summarized in refer- 
ence to its four major attributes [5]. 

Personalized 

Personalized medicine integrates personal genetic or protein 
profiles to strengthen healthcare at a more personalized level, 
particularly with the aid of recently emerging "-omic" tech- 
nologies such as nutritional genomics, pharmacogenomics, 
proteomics, and metabolomics [43]. Personalized medicine 
targets what has a positive effect on a patient's disease and then
develops safe and effective treatments for that specific disease 
[5]. In fact, genetic biomarkers that may be specifically associ- 
ated with a disease state are the foundation of personalized 



Table 1. Personalized medicine drugs for breast cancer as of July 2012

Biomarker | Drug class | Drug | Compound | Indication
BRCA1/2 | - | - | - | Guides surveillance and preventive treatment based on susceptibility risk for breast and ovarian cancer
Estrogen receptor (hormone receptor) | Selective estrogen receptor modulators | Nolvadex® | Tamoxifen | Tamoxifen is currently used for the treatment of estrogen receptor-positive breast cancer in pre- and post-menopausal women. Additionally, it is the most common hormone treatment for male breast cancer. It is also approved by the FDA for the prevention of breast cancer in women at high risk of developing the disease
Estrogen receptor (hormone receptor) | Selective estrogen receptor modulators | Fareston® | Toremifene | Toremifene is an estrogen agonist/antagonist indicated for the treatment of breast cancer in postmenopausal women with estrogen receptor-positive tumors
Estrogen receptor (hormone receptor) | Aromatase inhibitors | Femara® | Letrozole | Letrozole is indicated for the treatment of postmenopausal women with hormone receptor-positive breast cancer
Estrogen receptor (hormone receptor) | Aromatase inhibitors | Arimidex® | Anastrozole | Anastrozole is indicated for the treatment of postmenopausal women with hormone receptor-positive breast cancer
Estrogen receptor (hormone receptor) | Aromatase inhibitors | Aromasin® | Exemestane | Exemestane is indicated for the treatment of postmenopausal women with hormone receptor-positive breast cancer
Estrogen receptor (hormone receptor) | Estrogen receptor antagonist | Faslodex® | Fulvestrant | Fulvestrant is indicated for the treatment of hormone receptor-positive metastatic breast cancer in postmenopausal women with disease progression following antiestrogen therapy
Estrogen receptor (hormone receptor) | mTOR inhibitor | AFINITOR® | Everolimus | Everolimus is an mTOR inhibitor indicated for the treatment of postmenopausal women with advanced or metastatic hormone receptor-positive, HER2-negative breast cancer in combination with exemestane, after failure of treatment with letrozole or anastrozole
HER2/neu overexpression (HER2-positive) | Monoclonal antibody | Herceptin® | Trastuzumab | Trastuzumab is indicated for use in combination with cytotoxic chemotherapy for the treatment of breast cancer in women with HER2-positive tumors
HER2/neu overexpression (HER2-positive) | Monoclonal antibody | Perjeta® | Pertuzumab | Pertuzumab is indicated for use in combination with trastuzumab and docetaxel for the treatment of patients with HER2-positive metastatic breast cancer who have not received prior anti-HER2 therapy or chemotherapy for metastatic disease
HER2/neu overexpression (HER2-positive) | Tyrosine kinase inhibitor | Tykerb® | Lapatinib | Lapatinib is indicated in combination with capecitabine for the treatment of patients with advanced or metastatic breast cancer whose tumors overexpress HER2 and who have received prior therapy including an anthracycline, a taxane, and trastuzumab

Data from National Cancer Institute. Drug information: drugs approved for different types of cancer, http://www.cancer.gov/cancertopics/druginfo/drug-page-index [36];
National Cancer Institute. Drug information: drugs approved for breast cancer, http://www.cancer.gov/cancertopics/druginfo/breastcancer [37].
medicine. Knowledge of a patient's genetic profile leads to the
proper medication or therapy so that physicians can manage a
patient's disease or predisposition towards it using the proper
dose or treatment regimen [6].

Preventative 

Personalized medicine pursues not reaction but prevention.
With the ability to forecast disease risk or presence before
clinical symptoms appear, personalized medicine offers the
opportunity to act on the disease through early intervention.
In lieu of reacting to advanced stages of a disease, preventive
intervention can be life-saving in many cases. For example,
women with mutations in the BRCA1 or BRCA2 genes have a
higher chance of developing breast cancer than women in the
general population [44,45]. An accurate test of these breast
cancer susceptibility genes provides an objective risk
measurement that can guide surveillance and preventive
treatment, such as increased frequency of mammography,
prophylactic surgery, and chemoprevention (Table 2) [46].

Predictive 

Personalized medicine enables physicians to select optimal 
therapies and avoid adverse drug reactions. Molecular diagnos- 
tic devices using predictive biomarkers provide valuable infor- 
mation regarding genetically defined subgroups of patients who 
would benefit from a specific therapy. For example, Oncotype
DX® (Genomic Health, Redwood City, USA) uses a 16-gene
signature plus five reference genes (21 genes in total) to
determine whether women with certain types of breast cancer
are likely to benefit from chemotherapy [47-49].
MammaPrint® (Agendia, Amsterdam, the Netherlands) uses 
a 70-gene expression profile to assess the risk of distant metas- 
tasis in patients with early-stage breast cancer [50]. These 
complex diagnostic tests can be used to classify patients into 
subgroups to inform physicians whether patients would be 
treated successfully with hormone therapy alone or may require 
more aggressive chemotherapy treatment. 

Table 2. In vitro diagnostic devices for breast cancer as of July 2012
Treatment | Diagnostic device | Indication



Participatory 

Personalized medicine is expected to increase patient
adherence to treatment [51]. When personalized healthcare
demonstrates its effectiveness and minimizes adverse treatment
effects and unnecessary expense, patients will be more willing
to comply with their treatments.

STATISTICAL STRATEGIES FOR UNCOVERING 
GENE SIGNATURES THAT PREDICT CLINICAL 
OUTCOMES AND DRUG RESPONSES 

The critical component to success in personalized medicine 
is to uncover gene signatures that drive individual variability 
in clinical outcomes or drug responses. A number of systematic 
approaches have been proposed to identify molecular finger- 
prints that are predictive of patient prognosis and response to 
cancer treatments. In this review, we focus on the two most
commonly encountered approaches to biomarker discovery: the
data-driven and knowledge-driven approaches.

In the data-driven approach, biomarkers associated with 
tumor traits are objectively searched in genome-wide analysis 
using data-mining tools. The merit of this approach is that
biomarker discovery is unbiased. A downside is that gene
signatures identified by the data-driven approach are often
difficult to interpret because knowledge about their biological
functions is limited. In contrast, the knowledge-driven approach
selects candidate genes using prior knowledge or a survey of
the literature for evidence of linkage to either cancer
pathological processes or pathways important in drug responses.
As such, genes not yet known to be involved in a process
cannot be included.

A combination of the data-driven and knowledge-driven
approaches has been used to develop gene signatures [48].
Biomarker discovery in genome-wide analysis is subject to the
curse of dimensionality, i.e., the situation in which there are
far more genomic variables than samples [52].



Treatment | Diagnostic device | Indication
Chemotherapy | Mammostrat® | An immunohistochemical multigene test to predict the risk of early recurrence for postmenopausal, node-negative, estrogen receptor-positive patients who will receive endocrine therapy and are considering adjuvant chemotherapy
Chemotherapy | MammaPrint® | A microarray-based in vitro test based on a 70-gene expression profile to assess a patient's risk for distant metastasis
Chemotherapy | Oncotype DX® 21-gene signature | A diagnostic test based on a 16-gene signature (plus five reference genes) to assess the risk of recurrence for estrogen receptor-positive patients. High-risk patients may require additional chemotherapy whereas hormone therapy may be sufficient for low-risk patients
Chemotherapy | Compan DX® 31-gene signature | A diagnostic test based on a 31-gene panel to predict time to metastasis following initial surgery and biopsy

Data from U.S. Food and Drug Administration. Drugs@FDA: FDA approved drug products, http://www.accessdata.fda.gov/scripts/cder/drugsatfda [46].
One way to deal with this issue is to use the knowledge-driven
approach to reduce the number of candidate genes detected
by an objective genome-wide search.
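A back-of-the-envelope calculation shows why reducing the number of candidate genes helps. The sketch below assumes that each gene is tested at a fixed significance level and that no gene is truly associated with the outcome, so the expected number of false positives is simply the number of tests times the level. The gene counts (20,000 and 200) are hypothetical round numbers chosen for illustration, not values from the cited studies, and Bonferroni is used only as the simplest familiar correction.

```python
def multiple_testing_burden(n_genes, alpha=0.05):
    """Expected false positives at level alpha when no gene is truly associated,
    and the Bonferroni per-gene threshold that controls the family-wise error."""
    expected_false_positives = n_genes * alpha
    bonferroni_threshold = alpha / n_genes
    return expected_false_positives, bonferroni_threshold

# Hypothetical candidate-set sizes, chosen only for illustration
scenarios = {"genome-wide search": 20000, "knowledge-filtered candidates": 200}

for label, n_genes in scenarios.items():
    fp, thr = multiple_testing_burden(n_genes)
    print(f"{label}: {n_genes} tests, about {fp:.0f} expected false positives, "
          f"Bonferroni threshold {thr:.1e}")
```

With fewer candidates, each gene has to clear a far less stringent threshold, which matters when the number of samples is small.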

As an illustration, recently proposed systematic data-driven
approaches based on in vitro-generated predictive profiles from
cell-line models entail five key technical steps: 1) data
collection, 2) quality control, 3) identification of candidate
gene biomarkers, 4) construction of a multivariate prediction
model, and 5) independent validation of the prediction model
(Figure 2) [53-57].

Biomarker discovery begins by collecting molecular data in
a drug response experiment. A large number of genomic or
genetic characteristics of cell lines are experimentally
determined using high-throughput technologies. The drug's
pattern of activity in cells is measured on a continuous
(percent of cell survival or death) or discrete scale
(responsive or resistant).

The step that immediately follows acquisition of a large
amount of molecular data is quality control or pre-processing.
Because high-throughput technologies inevitably introduce
non-biological noise and bias during data collection,
appropriate normalization according to the specific array
technology is performed before further analysis. It is
important to note that quality control can affect downstream
data analysis.
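As one concrete instance of such a normalization, the sketch below implements quantile normalization, which forces every array to share the same empirical distribution of intensities. The choice of quantile normalization is an assumption made here for illustration; the text does not prescribe a particular method, and the appropriate procedure depends on the platform. The function name and the toy intensities are hypothetical, and ties are broken by order of appearance rather than averaged.

```python
def quantile_normalize(arrays):
    """Quantile-normalize several arrays measured on the same genes.

    arrays: list of lists; arrays[i][g] is the intensity of gene g on array i.
    Returns new lists in which every array has the same sorted values.
    """
    n_genes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_genes)
    ]
    normalized = []
    for a in arrays:
        # Gene indices ordered from lowest to highest intensity on this array
        order = sorted(range(n_genes), key=lambda g: a[g])
        out = [0.0] * n_genes
        for rank, g in enumerate(order):
            out[g] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized

# Toy intensities for three arrays and four genes (illustrative values only)
raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
norm = quantile_normalize(raw)
for row in norm:
    print([round(v, 3) for v in row])
```

After normalization the within-array ranking of genes is unchanged, but array-wide shifts in intensity no longer masquerade as biological differences.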

Figure 2. Schematic plot for a systematic statistical approach to identify
predictive biomarkers.

The subsequent step after assuring an adequate level of
normalization is to identify the subset of genes that are
candidate predictors highly associated with drug activity. This
step reduces the very high-dimensional parameter space of gene
variables [41]. In previous studies, various practical
approaches have been used, including classical two-sample
t-tests, variant t-tests [58-61], empirical Bayes methods
[62-64], a linear mixed-effect model [65], the generalized
likelihood ratio test [66], and the local-pooled-error test
[67]. Note that these statistical approaches rely on underlying
assumptions such as distributional specifications,
exchangeability for a random-effect distribution, constant
coefficients of variation, a mean-variance relationship, and
others.
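The simplest of these screening procedures, a gene-by-gene two-sample t-test, can be sketched as follows. The example uses Welch's unequal-variance form and ranks genes by the absolute t statistic; converting t to a p-value requires the t distribution, which is omitted here to keep the sketch dependency-free. The gene names, expression values, and the `rank_genes` helper are hypothetical and serve only to illustrate the step.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees of freedom.

    Assumes each group has at least two values and is not constant.
    """
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

def rank_genes(expression, labels, top_k):
    """Rank genes by |t| for sensitive vs. resistant cell lines; keep the top_k."""
    ranked = []
    for gene, values in expression.items():
        sens = [v for v, lab in zip(values, labels) if lab == "sensitive"]
        res = [v for v, lab in zip(values, labels) if lab == "resistant"]
        t, df = welch_t(sens, res)
        ranked.append((gene, t, df))
    ranked.sort(key=lambda row: abs(row[1]), reverse=True)
    return ranked[:top_k]

# Hypothetical log-expression values for eight cell lines (illustration only)
labels = ["sensitive"] * 4 + ["resistant"] * 4
expression = {
    "GENE_A": [8.1, 7.9, 8.4, 8.0, 5.2, 5.0, 5.5, 4.9],
    "GENE_B": [6.0, 6.3, 5.8, 6.1, 6.2, 5.9, 6.0, 6.4],
    "GENE_C": [3.1, 2.8, 3.5, 3.0, 4.0, 4.4, 3.9, 4.2],
}

for gene, t, df in rank_genes(expression, labels, top_k=2):
    print(f"{gene}: t = {t:.2f}, df = {df:.1f}")
```

With only a handful of samples per group, the per-gene variance estimates are unstable, which is one motivation for the variant t-tests and empirical Bayes methods listed above.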

Upon narrowing down candidate genes to a few hundred, a
statistical classification technique is then used to
construct a multivariate prediction model. A single biomarker
is unlikely to furnish sufficient sensitivity and specificity
for most applications [35]. Several classification methods have
been utilized, including a variant of linear discriminant
analysis [68], support vector machines [69-71], Bayesian
regression [72], partial least squares [73], principal
component regression [74], and between-group analysis [75]. The
performance of a statistical prediction model should be
assessed by statistical measures such as the classification
error rate, the area under the receiver operating
characteristic curve, the product of posterior classification
probabilities [76-78], and an index called the
misclassification-penalized posterior [79]. The leave-one-out
approach, random splitting, and bootstrapping are often
employed for internal cross-validation. Additionally,
multicenter validation is performed for external validation.
Previous studies suggest that no single classifier
consistently outperforms all other methods.
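Leave-one-out validation and two of the performance measures named above can be illustrated with a minimal sketch. The classifier used here is a nearest-centroid rule, swapped in purely because it is short to write; it is not one of the methods cited in this section. The samples, labels, and function names are hypothetical.

```python
from statistics import mean

def nearest_centroid_score(train_x, train_y, test_x):
    """Positive score: the test sample is closer to the class-1 centroid."""
    n_feat = len(test_x)
    c0 = [mean(x[j] for x, y in zip(train_x, train_y) if y == 0) for j in range(n_feat)]
    c1 = [mean(x[j] for x, y in zip(train_x, train_y) if y == 1) for j in range(n_feat)]
    d0 = sum((a - b) ** 2 for a, b in zip(test_x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(test_x, c1))
    return d0 - d1

def loocv_scores(xs, ys):
    """Leave-one-out: score each sample with a model fitted on all the others."""
    scores = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        scores.append(nearest_centroid_score(train_x, train_y, xs[i]))
    return scores

def auc(scores, ys):
    """Area under the ROC curve via pairwise comparison; ties count one half."""
    pos = [s for s, y in zip(scores, ys) if y == 1]
    neg = [s for s, y in zip(scores, ys) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical two-gene profiles; label 1 = responsive, 0 = resistant
samples = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.5, 2.5],
           [3.0, 0.5], [2.8, 0.9], [3.3, 0.4], [1.4, 2.1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scores = loocv_scores(samples, labels)
predictions = [1 if s > 0 else 0 for s in scores]
error_rate = sum(p != y for p, y in zip(predictions, labels)) / len(labels)
print(f"LOOCV error rate = {error_rate:.3f}, AUC = {auc(scores, labels):.3f}")
```

If candidate genes are selected from the same data, the selection step should be repeated inside each leave-one-out fold; selecting genes once on all samples and then cross-validating only the classifier gives optimistically biased estimates.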

Finally, the ultimate evidence of the usefulness of a predic- 
tion model in a clinical setting is randomized, prospective 
validation in a clinical trial [80]. After refinement and valida- 
tion in independent cohorts, the covariates in the prediction 
model can be used to develop assays that accurately predict 
prognosis and responses to chemotherapeutic agents, contrib- 
uting to the development of "personalized medicine" for 
patients with cancer. 

CONCLUSION 

Personalized medicine is receiving growing attention for its
tremendous potential and the myriad new opportunities it
offers. The ultimate promise of personalized medicine
depends on the discovery of the personal genetic causes of
disease. The remarkable advent of current high-throughput
technologies, in combination with improved knowledge of the
molecular basis of malignancy, provides a solid base for
identifying novel molecular targets. This revolutionized
paradigm in healthcare is already beginning to affect both
research and clinical practice.

Figure 3. Cost of sequencing a human-sized genome. Note that a
logarithmic scale is used on the Y axis. The cost of sequencing
decreased rapidly at an exponential rate from 2001 to 2007. The sudden
drop in cost around January 2008 was due to the shift in sequencing
technology from the first generation ("Sanger-based" or dideoxy chain
termination sequencing) to the second generation (or "next-generation").
The cost of sequencing has decreased dramatically since 2008. From
Wetterstrand KA. DNA sequencing costs: data from the NHGRI large-scale
genome sequencing program, http://www.genome.gov/sequencingcosts/ [81].

The use of high-throughput technologies is expected to
increase greatly in the next few years as their cost continues
to drop (Figure 3) [81]. Genomic sequencing and its
interpretation will have to be further developed and
standardized for routine clinical practice in order to yield
efficient and effective methods for discovering and verifying
new biomarkers and to enable personalized medicine
technologies. In particular, efforts to standardize existing
technologies will lead to more reproducible and robust
identification of biomarkers.

Several challenges must be overcome before this flood of
profile data is successfully translated into clinical utility
for patients with breast cancer. Improved knowledge obtained
using advanced profiling technologies will not be sufficient
for this purpose; all stakeholders involved in personalized
medicine must work together and share responsibility.
Regulatory authorities should provide clear guidelines for evaluating
and approving newly developed personalized drugs and should 
validate the capabilities of the diagnostic devices that predict 
patient prognoses or drug responses. Medical educational 
institutions should prepare the next generation of physicians 
to use and interpret personal genetic information appropriately 
and responsibly. Finally, public and private insurers need to 
evaluate the clinical and economic utility of personalized drugs 
and devices to facilitate reimbursement. 